feat(errors): structured what/why/fix errors + stable exit codes (error-system v2)#27
Merged
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d4de60aaa3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
alex-mextner
added a commit
that referenced
this pull request
Jun 17, 2026
Three deferred P2 findings from the PR #27 codex review, now fixed properly: 1. config.py — per-key LAYER PROVENANCE for unknown items. The loader now tracks which config FILE last declared each top-level key (key_sources, built in cascade order). An unknown item that came SOLELY from the global config is reported against ~/.config/rig/config.yaml, not the repo rig.yaml. plan.py's unknown-item errors resolve the source via config.source_for_key(). 2. missing_target.py — SKIP `-m module` invocations. `python3 -m my_hook --out x` runs a module by import name, not a script FILE; the scanner now treats `-m` like `-c` (interpreter mode, no path to verify), so a healthy module-based hook with an absolute --out arg no longer false-flags as a missing target. 3. cli.py — EXIT-CODE PRECEDENCE. A dead hook/target now surfaces its exit code (EXIT_MISSING_TARGET=5) even alongside config<->disk drift, instead of being masked by the drift exit (3). Both findings are still printed; missing-target outranks drift (it fails at runtime). Precedence documented in `rig --help`. Also makes tests/smoke.sh run pytest via `uv run --with pytest` when uv is available (the documented command), so the suite runs on a machine whose bare python3 lacks pytest. Tests: RED-first for each finding; full suite 504 -> 513 passing; smoke green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…or-system v2) Born from two same-day prod failures whose errors were thin and undiagnosable: `unknown mcp item(s): review (known: none)` (no hint it was a REMOVED slot) and a dead hook path surfacing only as a generic harness "PreToolUse error". This makes every rig error a consistent WHAT / WHY / HOW-to-fix block with a stable per-class exit code (a PUBLIC CONTRACT documented in `rig --help`). What's in: - riglib/errors.py: RigError(what/why/fix/exit_code) + per-class codes (2 config, 3 drift, 4 unknown-item, 5 missing-target, 6 not-a-repo, 127 missing-dep), a single top-level `guard()` renderer, did-you-mean (Levenshtein), and a removed-slot registry (mcp.items.review -> names agent-tools #32 + the fix). - riglib/layers.py: GLOBAL vs REPO classification — single source of truth so `rig status` groups drift by layer and names WHICH config file declares each item. - riglib/missing_target.py: proactively scans the harness settings.json for hook commands pointing at a script gone from disk (the dead-rtk-hook case). Resolves the invoked SCRIPT (skips bare/absolute interpreters like /usr/bin/env, honors `-c` only as an interpreter flag) and checks only that token, so a runtime output arg never false-flags. - cli.py: `rig status` renders the 3-part errors + layer grouping; a non-git dir shows "not a git repository — repo layer N/A" (never the "should be committed" nag), and degrades gracefully when the agent-tools catalog can't be resolved. plan.py raises the structured UnknownItemError. config.py adds primary_config_path for provenance. `rig doctor` honors the documented 127 missing-dependency exit code. Tests: 504 pytest passing; tests/smoke.sh fully green (incl. the non-git + dead-hook + removed-slot legs). New: test_errors, test_missing_target, test_plan_errors, test_status_layers; smoke gains non-git / clean-sample / removed-slot assertions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three deferred P2 findings from the PR #27 codex review, now fixed properly: 1. config.py — per-key LAYER PROVENANCE for unknown items. The loader now tracks which config FILE last declared each top-level key (key_sources, built in cascade order). An unknown item that came SOLELY from the global config is reported against ~/.config/rig/config.yaml, not the repo rig.yaml. plan.py's unknown-item errors resolve the source via config.source_for_key(). 2. missing_target.py — SKIP `-m module` invocations. `python3 -m my_hook --out x` runs a module by import name, not a script FILE; the scanner now treats `-m` like `-c` (interpreter mode, no path to verify), so a healthy module-based hook with an absolute --out arg no longer false-flags as a missing target. 3. cli.py — EXIT-CODE PRECEDENCE. A dead hook/target now surfaces its exit code (EXIT_MISSING_TARGET=5) even alongside config<->disk drift, instead of being masked by the drift exit (3). Both findings are still printed; missing-target outranks drift (it fails at runtime). Precedence documented in `rig --help`. Also makes tests/smoke.sh run pytest via `uv run --with pytest` when uv is available (the documented command), so the suite runs on a machine whose bare python3 lacks pytest. Tests: RED-first for each finding; full suite 504 -> 513 passing; smoke green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
3e07a9c to
fe2a6a5
Compare
alex-mextner
added a commit
that referenced
this pull request
Jun 17, 2026
…ixture #27's clean-sample leg enumerates every default-ON category and disables it to assert zero false drift. tg_ctl (added by this PR) is default-ON, so its provision_tg_ctl action made the clean sample report drift (exit 3). Mirror how #29 added 'gitignore: {enabled: false}' and disable tg_ctl too.
alex-mextner
added a commit
that referenced
this pull request
Jun 17, 2026
* feat(rig): add tg_ctl config block + pure plist planning Add the tg_ctl config block (validate + plan) and the pure, effect-free TgCtlPlan that renders the ai.hyperide.tg-ctl.plist LaunchAgent XML byte-exact to the working hand-created file (sort_keys=False preserves the insertion order so a re-apply is a true no-op). Default-on, per-machine (GLOBAL layer), macOS-only. Mirrors the tmux block's schema style. boot:null and label:null resolve to their defaults (not bool(None)=False / str(None)="None"). Reviewed via multi-model `review`; findings addressed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(rig): provision + reconcile tg-ctl boot LaunchAgent Runner: _do_provision_tg_ctl writes the byte-exact plist, backs up a differing prior, ensures the log dir, tears down the stale predecessor (com.ultra.codex-tg-bot: bootout + timestamped backup + remove), and (re)loads via launchctl bootout/bootstrap in the gui/<uid> domain. A re-apply against the already-correct loaded plist is a skipped no-op. RIG_TG_CTL_DRY_RUN writes the plist but skips every live/destructive mutation (launchctl AND the stale teardown) so tests/smoke never touch the real launchd domain. Drift: _check_tg_ctl flags missing / divergent / written-but-not-loaded, a leftover plist when boot:false, and the stale predecessor (extra). CLI: GLOBAL status line shows installed / drifted / disabled / unsupported (off-darwin), resolved through the shared plan builder. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(rig): tg-ctl unit suite + HOME-isolated smoke leg test_tg_ctl.py mirrors test_tmux.py: config validation, byte-exact plist render (incl. against the live machine plist when present, read-only), create/idempotent/conflict/dry-run states, stale-predecessor teardown, drift (missing/modified/extra/not-loaded), status states, and the boot:null / label:null / dry-run-no-stale-removal / off-darwin regressions. conftest neutralizes the default-on tg_ctl provisioner + drift check and stubs the gui-domain launchctl seams suite-wide (dedicated tests restore the real ones with their own HOME-isolated tmp dirs); no test ever touches the real ~/Library/LaunchAgents or runs real launchctl. smoke.sh gains a focused, HOME-isolated, RIG_TG_CTL_DRY_RUN tg-ctl leg and prefers `uv run --extra test pytest`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs(rig): document the tg_ctl config block docs/config-schema.md: the tg_ctl section (keys, defaults, the byte-exact no-op contract, gui-domain (re)load, stale-predecessor teardown, drift, the RIG_TG_CTL_DRY_RUN seam, and the enabled:false vs boot:false distinction) + the validation paragraph. AGENTS.md: refine the "never mutate a LIVE service" rule — the stateless background daemons (models cron, tg_ctl) are the documented (re)load exceptions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(rig): disable tg_ctl in the error-system-v2 clean-sample smoke fixture #27's clean-sample leg enumerates every default-ON category and disables it to assert zero false drift. tg_ctl (added by this PR) is default-ON, so its provision_tg_ctl action made the clean sample report drift (exit 3). Mirror how #29 added 'gitignore: {enabled: false}' and disable tg_ctl too. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
error-system v2 — structured what/why/fix errors + stable exit codes
Born from two same-day prod failures whose errors were thin and undiagnosable:
unknown mcp item(s): review (known: none)(no hint it was a removed slot) and adead hook path surfacing only as a generic harness "PreToolUse error". This makes every
rig error a consistent WHAT / WHY / HOW-to-fix block with a stable per-class exit
code (a public contract, documented in
rig --help).What's in
riglib/errors.py—RigError(what/why/fix/exit_code)+ per-class codes(
2config,3drift,4unknown-item,5missing-target,6not-a-repo,127missing-dep), a single top-levelguard()renderer, did-you-mean(Levenshtein-nearest), and a removed-slot registry (
mcp.items.review→ namesagent-tools feat(status): area-coverage summary — every reconciled area, not "mostly skills" #32 + the exact fix instead of "known: none").
riglib/layers.py— GLOBAL vs REPO classification, single source of truth sorig statusgroups drift by layer and names which config file declares each item.riglib/missing_target.py— proactively scans the harnesssettings.jsonforhook commands pointing at a script gone from disk (the dead-rtk-hook case). Resolves
the invoked script (skips bare/absolute interpreters like
/usr/bin/env,/opt/homebrew/bin/python3; honors-conly as an interpreter flag), checks only thattoken so a runtime output arg never false-flags.
cli.py— 3-part error rendering + layer-groupedrig status; a non-git dirshows
not a git repository — repo layer N/A(never the "should be committed" nag) anddegrades gracefully when the agent-tools catalog can't be resolved.
rig doctorhonorsthe documented
127missing-dependency exit code.plan.pyraises the structuredUnknownItemError;config.pyaddsprimary_config_pathfor error provenance.Verification
uv run --with pytest python -m pytest tests/→ 504 passed.bash tests/smoke.sh→ fully green (incl. the non-git, clean-sample, removed-slot,and dead-hook legs). Fixed a latent smoke bug: the non-git test dir was nested under the
smoke's own
git init-ed$TMP, so it wasn't actually non-git.review --staged(Opus / Codex); their findings on the newmodules were addressed (HOME-isolated the new status tests; tightened the missing-target
script resolution for absolute interpreters and
-c).Known follow-ups (out of scope for this PR; CTO to triage)
primary_config_pathnames the repo file even when thebad key came from the global layer.
rig statushas both drift and a dead target, itreturns
3(drift) and the5(missing-target) signal is masked.errors.ConfigErrorexists butconfig.ConfigErroris still what'sraised for malformed config (so YAML errors still render the old one-line form).
~/.claudehook is surfaced in normal status but not in the non-git + no-catalog path).
🤖 Generated with Claude Code